Edges weighted with the combined score from the STRING database will be useful both as a baseline for comparison against our own method and for testing the community detection analysis before the weighted edges from our method are ready. There are two ways to obtain these weights:

1. Use the STRING online service.
2. Parse the STRING summary features we have already extracted in this notebook.

Unfortunately, the online service returns a table that does not include the Entrez IDs originally submitted, so its output would have to be mapped back to Entrez IDs for our pipeline. The faster route is to load the pickled object that the above notebook created to generate features and keep only the combined score for each pair:



In [1]:

    
cd ../../features









    



/home/gavin/Documents/MRes/features



In [2]:

    
import csv



In [3]:

    
ls









    



abundance.Entrez.full.txt@       head.training.nolabel.negative.Entrez.vectors.txt@
abundance.Entrez.traintest.txt@  pulldown.edges.Entrez.txt@
autogit.log                      pulldown.nolabel.Entrez.vectors.txt@
c2s.Entrez.full.txt@             training.nolabel.negative.Entrez.vectors.txt
c2s.Entrez.traintest.txt@        training.nolabel.positive.Entrez.vectors.txt



In [4]:

    
import sys



In [5]:

    
sys.path.append("/home/gavin/Documents/MRes/opencast-bio/")



In [6]:

    
import ocbio.string



In [7]:

    
import pickle



In [8]:

    
# open the pickle in binary mode; the with block closes the file for us
with open("../string/human.Entrez.string.pickle", "rb") as f:
    stringfeatures = pickle.load(f)
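The loaded object is indexed further down by an unordered pair of Entrez IDs, with the combined score taken from the last position of the returned feature vector. A minimal sketch of that lookup, assuming the object behaves like a plain dictionary keyed by `frozenset` pairs; `toy_features`, `combined_score` and all values here are made up for illustration and are not part of `ocbio`:

```python
# Toy stand-in for the pickled STRING features (values are invented).
# Assumption: keys are frozensets of two Entrez IDs and the last element
# of each feature vector is the STRING combined score.
toy_features = {
    frozenset(["1000", "2000"]): ["0", "0.35", "0.9"],
    frozenset(["1000", "3000"]): ["0", "0", "0"],
}


def combined_score(features, id_a, id_b):
    """Return the last feature of the pair, converted to a float."""
    return float(features[frozenset([id_a, id_b])][-1])


# frozenset keys ignore order, so both orientations find the same entry
forward = combined_score(toy_features, "1000", "2000")
reverse = combined_score(toy_features, "2000", "1000")
```

Keying on a `frozenset` means the lookup does not depend on which protein is listed first in the pulldown file.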



In [32]:

    
# the with block closes both files even if a lookup fails part-way through
with open("../forGAVIN/pulldown_data/pulldown.interactions.Entrez.tsv") as pulldownpairfile, \
        open("pulldown.string.edges.tsv", "w") as stringedgefile:
    cp = csv.reader(pulldownpairfile, delimiter="\t")
    cs = csv.writer(stringedgefile, delimiter="\t")
    for l in cp:
        # index the feature dictionary by the unordered pair of Entrez IDs
        pair = frozenset(l)
        # the combined score is the last STRING feature
        combinedscore = float(stringfeatures[pair][-1])
        # write only the pairs with a non-zero combined score
        if combinedscore > 0.0000001:
            cs.writerow(l + [combinedscore])
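The loop above stops at the first pair that has no entry in the feature object, if that object raises on unknown pairs. As a hedged alternative, the following sketch collects such pairs instead of failing. It assumes the feature object raises `KeyError` for unknown pairs, which has not been checked against `ocbio.string`, and it is written for Python 3; `write_string_edges` and the toy data are hypothetical names introduced only for this example.

```python
import csv
import io


def write_string_edges(pairs, features, outfile, threshold=1e-7):
    """Write tab-separated (id_a, id_b, score) rows for pairs above threshold.

    Returns the list of pairs that had no entry in `features`.
    """
    writer = csv.writer(outfile, delimiter="\t", lineterminator="\n")
    missing = []
    for row in pairs:
        try:
            # combined score assumed to be the last feature of the pair
            score = float(features[frozenset(row)][-1])
        except KeyError:
            # assumption: unknown pairs raise KeyError, as with a plain dict
            missing.append(row)
            continue
        if score > threshold:
            writer.writerow(list(row) + [score])
    return missing


# Toy data, invented for illustration only
toy_features = {
    frozenset(["1", "2"]): ["0.1", "0.8"],
    frozenset(["1", "3"]): ["0", "0"],
}
toy_pairs = [["1", "2"], ["1", "3"], ["4", "5"]]

buf = io.StringIO()
missing = write_string_edges(toy_pairs, toy_features, buf)
```

Returning the missing pairs makes it easy to see how much of the pulldown network STRING does not cover before the edges are passed on to community detection.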



In [33]:

    
!head pulldown.string.edges.tsv